Performance of the relative-rate test under nonstationary models of nucleotide substitution.

Authors

  • N J Tourasse
  • W H Li
Abstract

Relative-rate tests have previously been developed to compare the substitution rates of two sequences or two groups of sequences. These tests usually assume that the process of nucleotide substitution is stationary and the same for all lineages, i.e., uniform. In this study, we conducted simulations to assess the performance of the relative-rate tests when the molecular-clock (MC) hypothesis is true (i.e., there is no rate difference between lineages) but the stationarity and uniformity assumptions are violated. Kimura's and bias-corrected LogDet distances were used. We found that the computation of the variances and covariances of LogDet distances had to be modified, because the constraint that the frequencies of the 16 nucleotide pair types sum to 1 must be imposed. Comparing the rates of two single sequences (Wu and Li's test) or of two groups of sequences (Li and Bousquet's test) gave similar results. When the sequences are long (≥ 500 nt), the test based on LogDet distances and their properly computed variances and covariances is appropriate even when the substitution process is not stationary and/or not uniform. That is, at the 5% significance level, the test rejects the MC hypothesis in about 5% of the simulation replicates. In contrast, if the sequences are short (≤ 200 nt) and highly divergent, the LogDet test is very conservative because the variances of the distances are overestimated. When the uniformity assumption is violated, the relative-rate test based on Kimura's distances can be severely misleading because of differences in base composition between sequences. However, when the uniformity assumption holds, so that base frequencies remain similar among sequences, the rejection rate is close to 5%, especially with short sequences. Under such conditions, the test using Kimura's distances performs better than the LogDet test. The reason seems to be that Kimura's distances, which depend on only two parameters, are less affected by a reduction in the number of sites than the LogDet distances.
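The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the standard Kimura two-parameter formula, a common normalized (paralinear-style) form of the LogDet distance without the bias correction used in the study, and a relative-rate statistic of the form Z = (d13 - d23) / sqrt(Var(d13) + Var(d23) - 2 Cov(d13, d23)), where sequence 3 is the outgroup. The function names are hypothetical, and the variances and covariance are taken as inputs because the abstract does not give their formulas.

```python
import math

BASES = "ACGT"
PURINES = {"A", "G"}


def _aligned_pairs(seq_x, seq_y):
    """Keep only aligned sites where both sequences carry an unambiguous base."""
    return [(a, b) for a, b in zip(seq_x, seq_y) if a in BASES and b in BASES]


def k2p_distance(seq_x, seq_y):
    """Kimura two-parameter distance from transition (P) and transversion (Q) proportions."""
    pairs = _aligned_pairs(seq_x, seq_y)
    n = len(pairs)
    transitions = sum(1 for a, b in pairs
                      if a != b and (a in PURINES) == (b in PURINES))
    transversions = sum(1 for a, b in pairs
                        if (a in PURINES) != (b in PURINES))
    p, q = transitions / n, transversions / n
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)


def _det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n = len(m)
    det = 1.0
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(m[r][c]))
        if m[pivot][c] == 0.0:
            return 0.0
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            det = -det
        det *= m[c][c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for k in range(c, n):
                m[r][k] -= factor * m[c][k]
    return det


def logdet_distance(seq_x, seq_y):
    """Normalized LogDet distance (assumed form, no bias correction).

    F holds the frequencies of the 16 nucleotide pair types and sums to 1.
    Raises ValueError if det(F) <= 0 or a base is absent from either sequence.
    """
    pairs = _aligned_pairs(seq_x, seq_y)
    n = len(pairs)
    index = {base: i for i, base in enumerate(BASES)}
    f = [[0.0] * 4 for _ in range(4)]
    for a, b in pairs:
        f[index[a]][index[b]] += 1.0 / n
    fx = [sum(row) for row in f]                              # base frequencies in x
    fy = [sum(f[i][j] for i in range(4)) for j in range(4)]   # base frequencies in y
    log_marginals = sum(math.log(v) for v in fx + fy)
    return -0.25 * (math.log(_det(f)) - 0.5 * log_marginals)


def relative_rate_z(d13, d23, var13, var23, cov):
    """Z statistic for H0: d13 == d23 (molecular clock), with 3 as the outgroup."""
    return (d13 - d23) / math.sqrt(var13 + var23 - 2.0 * cov)
```

Under the MC hypothesis the statistic is compared with the standard normal distribution, so |Z| > 1.96 corresponds to rejection at the 5% level. The sketch also makes the closing remark of the abstract concrete: `k2p_distance` depends only on P and Q, whereas `logdet_distance` needs all 16 pair frequencies, which are estimated less reliably from short sequences.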

Similar resources

Estimation of evolutionary distances under stationary and nonstationary models of nucleotide substitution.

Estimation of evolutionary distances has always been a major issue in the study of molecular evolution because evolutionary distances are required for estimating the rate of evolution in a gene, the divergence dates between genes or organisms, and the relationships among genes or organisms. Other closely related issues are the estimation of the pattern of nucleotide substitution, the estimation...

Evaluation of Ancestral Sequence Reconstruction Methods to Infer Nonstationary Patterns of Nucleotide Substitution.

Inference of gene sequences in ancestral species has been widely used to test hypotheses concerning the process of molecular sequence evolution. However, the approach may produce spurious results, mainly because using the single best reconstruction while ignoring the suboptimal ones creates systematic biases. Here we implement methods to correct for such biases and use computer simulation to ev...

INDELible: A Flexible Simulator of Biological Sequence Evolution

Many methods exist for reconstructing phylogenies from molecular sequence data, but few phylogenies are known and can be used to check their efficacy. Simulation remains the most important approach to testing the accuracy and robustness of phylogenetic inference methods. However, current simulation programs are limited, especially concerning realistic models for simulating insertions and deleti...

Standard Codon Substitution Models Overestimate Purifying Selection for Nonstationary Data

Estimation of natural selection on protein-coding sequences is a key comparative genomics approach for de novo prediction of lineage-specific adaptations. Selective pressure is measured on a per-gene basis by comparing the rate of nonsynonymous substitutions to the rate of synonymous substitutions. All published codon substitution models have been time-reversible and thus assume that sequence c...

Selecting the best-fit model of nucleotide substitution.

Despite the relevant role of models of nucleotide substitution in phylogenetics, choosing among different models remains a problem. Several statistical methods for selecting the model that best fits the data at hand have been proposed, but their absolute and relative performance has not yet been characterized. In this study, we compare under various conditions the performance of different hiera...

Journal title:
  • Molecular biology and evolution

Volume 16, Issue 8

Pages -

Publication date: 1999